Scene Description from Images to Sentences

Authors

  • Khushali Acharya
  • Abhinay Pandya
1 Computer Engineering, LDRP-ITR, Gandhinagar, India
2 Prof. & HOD, Information Technology, LDRP-ITR, Gandhinagar, India

Abstract

People exchange their views using language, whether spoken, written, or typed. A notable amount of this language describes the environment around us, especially the visual scenes in our surroundings or depicted in images or video. Scene description aims to generate sentences from a given set of input images. It links visual perception with the language space. All present approaches operate purely in a supervised machine learning setup. However, owing to the dearth of training data, they seldom achieve the desired accuracy. We present a model that uses "Distributed Intelligence", a prevalent theme in the artificial intelligence literature. Rather than relying only on the training datasets (PASCAL VOC 2012 with 11,530 images, FLICKR8K, FLICKR30K, and MSCOCO), we harness the power of the internet to generate more precise sentences related to the images.

Similar resources

From Images to Sentences through Scene Description Graphs using Commonsense Reasoning and Knowledge

In this paper we propose the construction of linguistic descriptions of images. This is achieved through the extraction of scene description graphs (SDGs) from visual scenes using an automatically constructed knowledge base. SDGs are constructed using both vision and reasoning. Specifically, commonsense reasoning is applied on (a) detections obtained from existing perception methods on given i...

Color scene transform between images using Rosenfeld-Kak histogram matching method

In digital color imaging, it is of interest to transform the color scene of one image to that of another. Some attempts have been made at this using, for example, the lαβ color space, principal component analysis, and more recently the histogram rescaling method. In this research, a novel method is proposed based on the Rosenfeld and Kak histogram matching algorithm. It is suggested that to transform the color...

Aligning where to see and what to tell: image caption with region-based attention and scene factorization

Recent progress on automatic generation of image captions has shown that it is possible to describe the most salient information conveyed by images with accurate and meaningful sentences. In this paper, we propose an image caption system that exploits the parallel structures between images and sentences. In our model, the process of generating the next word, given the previously generated ones,...

DISCO: Describing Images Using Scene Contexts and Objects

In this paper, we propose a bottom-up approach to generating short descriptive sentences from images, to enhance scene understanding. We demonstrate automatic methods for mapping the visual content in an image to natural spoken or written language. We also introduce a human-in-the-loop evaluation strategy that quantitatively captures the meaningfulness of the generated sentences. We recorded a ...

Parsing Natural Scenes and Natural Language with Recursive Neural Networks

Recursive structure is commonly found in the inputs of different modalities such as natural scene images or natural language sentences. Discovering this recursive structure helps us to not only identify the units that an image or sentence contains but also how they interact to form a whole. We introduce a max-margin structure prediction architecture based on recursive neural networks that can s...

Publication date: 2017